{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "(auto-office)=\n",
    "# The Automated Office\n",
    "\n",
    "In this chapter, we'll look at a range of ways to automate processes and tasks you might need to undertake in an office context.\n",
    "\n",
    "Let's import a few of the packages we'll need first. You may need to install some of these; the Chapter on {ref}`code-preliminaries` covers how to install new packages."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from rich import inspect\n",
    "import matplotlib.pyplot as plt\n",
    "import warnings\n",
    "\n",
    "# Plot settings\n",
    "plt.style.use(\n",
    "    \"https://github.com/aeturrell/coding-for-economists/raw/main/plot_style.txt\"\n",
    ")\n",
    "\n",
    "# Pandas: Set max rows displayed for readability\n",
    "pd.set_option(\"display.max_rows\", 8)\n",
    "\n",
    "# Set seed for random numbers\n",
    "seed_for_prng = 78557\n",
    "prng = np.random.default_rng(seed_for_prng)  # prng=probabilistic random number generator\n",
    "# Turn off warnings\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Files\n",
    "\n",
    "Python is sometimes thought of as a 'glue' language because it can glue together lots of different functionalities (including calling other languages). The ins and outs of your operating system are no different.\n",
    "\n",
    "The single most important module for manipulating files in Python is `os`, which interacts with your operating system and is built-in to Python (so no need for a separate install). Let's start by getting the current working directory (`getcwd()`) for the kernel (this will be whatever computer the code is being run on)."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "\n",
    "os.getcwd()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`os` can be used to create files and directories, for example `os.mkdir()` creates a new directory (but throws an error if it already exists). There are also commands to remove files, which should of course be used with care!\n",
    "\n",
    "One particularly useful `os` method is `stat(path).st_size`, which returns the size of the file from a given path. To get a bit meta, we can use it to query the size of the page you're currently reading."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Size in bytes\n",
    "print(f\"The current page is {os.stat('auto-office.ipynb').st_size/1e3} kilobytes.\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Another command you should be aware of is `os.chdir(path)` which, when used, changes the working path of your code. To see the contents of the directory that your interactive window is currently in, use `os.listdir()`. Here's an example of using it, though we'll only show the first five files:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "os.listdir()[:5]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "`shutil` is another handy file-manipulation module built-in to Python. It has `copyfile` and `move` functions, which do exactly what you'd expect. You can find more information on organising files and folders in an automated way over in [Chapter 9](https://automatetheboringstuff.com/chapter9/) of *Automate the Boring Stuff With Python*."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "[**watchdog**](https://pythonhosted.org/watchdog) is a library that allows you to monitor files on a computer for changes, and to log changes to a text file when they do occur. This can be useful in a production setting, or for monitoring changes in files on a connected network drive."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Downloading Files\n",
    "\n",
    "Downloading files programmatically and repeatably is possible using the `urllib` library, which comes built-in with Python. Here's an example of how to use it to download a file and give it a specific name:\n",
    "\n",
    "```python\n",
    "\n",
    "import urllib.request\n",
    "\n",
    "url = \"https://assets.publishing.service.gov.uk/government/uploads/system/uploads/attachment_data/file/1031268/NIC_Annual_Report_and_Accounts_2020_to_2021_Final_4_November.pdf\"\n",
    "\n",
    "urllib.request.urlretrieve(url, \"nic_ann_rep.pdf\")\n",
    "```\n",
    "\n",
    "You can also download and unzip files in one fell swoop.\n",
    "\n",
    "```python\n",
    "from io import BytesIO\n",
    "from urllib.request import urlopen\n",
    "from zipfile import ZipFile\n",
    "\n",
    "# URL of the zip file\n",
    "zipurl = \"https://files.stlouisfed.org/files/htdocs/uploads/FRED-QD%20Appendix.zip\"\n",
    "\n",
    "# extract to path\n",
    "extract_to = \"downloads/\"\n",
    "\n",
    "zipfile = ZipFile(BytesIO(urlopen(url).read()))\n",
    "zipfile.extractall(path=extract_to)\n",
    "```\n",
    "\n",
    "If you don't want to actually save the files on your computer, you can still look at the contents of them:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from io import BytesIO\n",
    "from urllib.request import urlopen\n",
    "from zipfile import ZipFile\n",
    "\n",
    "# URL of the zip file\n",
    "zipurl = \"https://files.stlouisfed.org/files/htdocs/uploads/FRED-QD%20Appendix.zip\"\n",
    "\n",
    "# Take a look at the contents\n",
    "with urlopen(zipurl) as zipresp:\n",
    "    with ZipFile(BytesIO(zipresp.read())) as zfile:\n",
    "        print(\"\\n\".join(zfile.namelist()))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Links and Websites\n",
    "\n",
    "### Checking Dead Links\n",
    "\n",
    "Let's say you've created a new website, perhaps using the [Quarto tool](https://quarto.org/docs/websites/) that's featured in the Chapter on {ref}`auto-reports`. When you wrote it, all the links worked fine! But the internet shifts and changes (one reason why PDFs are under-rated...), and there's no guarantee that the links that used to work still will.\n",
    "\n",
    "Fortunately, there are command line tools like the Python package [**deadlinks**](https://github.com/butuzov/deadlinks) out there that can check your links programmatically to see if any are down. Although **deadlinks** is a Python package, it's set up to work as a *command line* tool, so the syntax to use it is\n",
    "\n",
    "```bash\n",
    "deadlinks https://your-webpage-here.html\n",
    "```\n",
    "\n",
    "in the terminal. You will need to install **deadlinks** via pip."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Websites\n",
    "\n",
    "You might be surprised to know you can programmatically open up browser windows, navigate around, and do anything you'd reasonably do as a user. In fact, this technique is often used for web scraping, a technique to programmatically obtain information from websites. We've seen a bit of web scraping already in {ref}`data-extraction`, so here let's focus on simply opening websites programmatically.\n",
    "\n",
    "The module that lets you do this is called **webbrowser**, and it's built-in to the standard Python library (no need to instally anything but Python itself)."
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "```python\n",
    "import webbrowser\n",
    "\n",
    "url = 'https://docs.python.org/'\n",
    "\n",
    "# Open URL in a new tab, if a browser window is already open.\n",
    "webbrowser.open_new_tab(url)\n",
    "\n",
    "# Open URL in new window, raising the window if possible.\n",
    "webbrowser.open_new(url)\n",
    "```"
   ]
  }
 ],
 "metadata": {
  "interpreter": {
   "hash": "c4570b151692b3082981c89d172815ada9960dee4eb0bedb37dc10c95601d3bd"
  },
  "kernelspec": {
   "display_name": "Python 3.8.12 64-bit ('codeforecon': conda)",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.12"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
